In 2018, the state of California passed legislation related to water conservation and drought planning. This legislation will require all urban water retailers to meet state-defined annual water use targets. To meet these objectives, retailers will need to more closely track the different components of their water use budgets (e.g. residential indoor, residential outdoor, commercial outdoor, etc.).

This project will focus on commercial outdoor water budgets. For the purposes of meeting state water use objectives, a retailer only needs to consider commercial properties (e.g. parks, golf courses, universities, hospitals) which have a water meter dedicated entirely to irrigation. If a commercial property has a large landscaped area, it is more likely to have a dedicated irrigation meter. All other commercial outdoor water use can be ignored.

The outdoor budget for these commercial properties is calculated by multiplying the total outdoor landscaped area of these properties by a standard evapotranspiration (ET) value. This ET value is set by the state based on their definition of "efficient water use." It is up to the retailer to determine their total commercial outdoor landscaped area to insert into the equation.
This project will outline an approach for retailers to remotely calculate the total landscaped area of commercial properties using classified 1 meter resolution NAIP imagery, county parcel data, and point locations of irrigation meters.
The major challenge in this process will be the land cover classification of the NAIP imagery to identify vegetation pixels. Details on the land cover classification approach can be found in section 1.5.
The approach will initially be developed for a small area within the greater Long Beach service area. Once the individual steps have been successfully established, the workflow can then be scaled to analyze an entire service area. The location of the Long Beach service area is shown below.
# Import packages
import os
import numpy as np
import numpy.ma as ma
import geopandas as gpd
import matplotlib.pyplot as plt
import earthpy as et
import earthpy.plot as ep
import rioxarray as rxr
import rasterio as rio
from rasterio.plot import plotting_extent
import contextily as cx
from shapely.geometry import Polygon
# Set working directory
os.chdir(os.path.join(et.io.HOME,
'earth-analytics',
'data',
'project-data'))
# Define functions used in this workflow
def clean_array_plot(xr_obj):
"""Takes a single xarray object as an input and produces a cleaned numpy
array output for plotting.
Parameters
----------
xr_obj : xarray DataArray
xarray object containing null values
Returns
----------
masked_array : numpy array
masked numpy array
"""
masked_array = ma.masked_array(xr_obj.values, xr_obj.isnull())
return masked_array
# Import shapefile with all supplier service areas in southern California
servicearea_path = os.path.join("Retail_Agency_spatial_files",
"_RetailAgencies.shp")
servicearea = gpd.read_file(servicearea_path)
# Define just Long Beach service area
lb_servicearea = servicearea[servicearea['MEMBERABBR']=='LB']
# Check out where Long Beach service area is on a map!
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(18, 12))
servicearea.to_crs('EPSG:3857').plot(color='grey',
alpha=0,
ax=ax1)
lb_servicearea.to_crs('EPSG:3857').plot(color='gold',
alpha=0.5,
linewidth=1,
edgecolor='k',
ax=ax1)
lb_servicearea.to_crs('EPSG:3857').plot(color='gold',
alpha=0.5,
linewidth=1,
edgecolor='k',
ax=ax2)
cx.add_basemap(ax1)
cx.add_basemap(ax2)
ax1.set_axis_off()
ax2.set_axis_off()
ax1.set_title('Location of Long Beach Service Area in Southern California',
fontsize=18)
ax2.set_title('Detailed look at Long Beach Service Area',
fontsize=18)
plt.tight_layout()
This approach will use 2020 NAIP data. For the initial workflow development, only 1 NAIP tile will be used. Each individual image covers a 3.75 x 3.75 minute quarter quadrangle plus a 300 meter buffer on all four sides. This is a small subset of the Long Beach service area, as shown below.
# Open test NAIP file
tile_path = os.path.join('naip',
'm_3311816_sw_11_060_20200505',
'm_3311816_sw_11_060_20200505.tif')
test_tile = rxr.open_rasterio(tile_path)
# Define bounds of test tile and create polygon of tile footprint
tile_bounds = test_tile.rio.bounds()
lat_point_list = [tile_bounds[1], tile_bounds[1], tile_bounds[3], tile_bounds[3]]
lon_point_list = [tile_bounds[0], tile_bounds[2], tile_bounds[2], tile_bounds[0]]
polygon_geom = Polygon(zip(lon_point_list, lat_point_list))
crs = test_tile.rio.crs
tile_polygon = gpd.GeoDataFrame(index=[0], crs=crs, geometry=[polygon_geom])
# Plot tile over Long Beach service area boundary
fig, ax = plt.subplots(figsize=(15, 12))
lb_servicearea.to_crs(polygon.crs).plot(color='gold',
alpha=0.5,
linewidth=1,
edgecolor='k',
ax=ax)
tile_polygon.plot(ax=ax,
color='white',
edgecolor='r',
linewidth=2,
alpha=.5)
ax.set_title('Location of Test Tile within Long Beach Service Area',
fontsize=18)
ax.set_axis_off()
Notice that the tile doesn't entirely overlap the Long Beach service area. For the purposes of this exercise, we will do the analysis on the entire NAIP test tile so that we have more area to work with. In the future, we can crop the final shapefile of the classified properties to the service area boundary.
fig, ax = plt.subplots(figsize=(10, 15))
ep.plot_rgb(clean_array_plot(test_tile),
rgb=[0, 1, 2],
ax=ax)
ax.set_title('2020 NAIP imagery for test location near Long Beach, CA',
fontsize=18)
ax.set_axis_off()
Notice above that there are several properties with large landscaped areas. Golf courses, a park, and sports fields are identifiable from this view. These are the landscaped areas we are interested in identifying and calculating their associated areas.
LA County maintains a shapefile of all parcels in the county. This file also contains information about the property, including whether it is a commercial or residential property. We can overlay a shapefile of commercial property boundaries onto our test tile image.
parcels_path = os.path.join('parcels',
'Parcels.shp')
la_parcels = gpd.read_file(parcels_path)
la_parcels_reproj = la_parcels.to_crs(tile_polygon.crs)
# clipped_parcels = gpd.clip(la_parcels_reproj, tile_polygon)
tile_extent = plotting_extent(test_tile[0],
test_tile.rio.transform())
fig, ax = plt.subplots(figsize=(12,10))
ep.plot_rgb(clean_array_plot(test_tile),
rgb=[0, 1, 2],
ax=ax,
extent=tile_extent)
la_parcels_reproj.plot(ax=ax)
ax.set_title('Parcels boundaries in test tile area (and beyond)',
fontsize=18)
Text(0.5, 1.0, 'Parcels boundaries in test tile area (and beyond)')
I finally got the crazy LA County ArcGIS portal to spit out some parcel data, but it seems to have only exported those within the extent of the Online Map Viewer. I then had to open the layer in ArcGIS Pro desktop, then export it as a shapefile to import here. I'll work on this more.
# Subset to just commercial, industrial, institutional (non-residential) parcels
cii_parcels = la_parcels_reproj[(la_parcels_reproj['UseType'] == 'Recreational') |
(la_parcels_reproj['UseType'] == 'Commercial') |
(la_parcels_reproj['UseType'] == 'Institutional') |
(la_parcels_reproj['UseType'] == 'Industrial') |
(la_parcels_reproj['UseType'] == 'Government')]
fig, ax = plt.subplots(figsize=(12,10))
ep.plot_rgb(clean_array_plot(test_tile),
rgb=[0, 1, 2],
ax=ax,
extent=tile_extent)
cii_parcels.plot(ax=ax)
ax.set_title('Commercial, Instititutional, Industrial, Recreational and Government parcels',
fontsize=18)
Text(0.5, 1.0, 'Commercial, Instititutional, Industrial, Recreational and Government parcels')
This cii_parcel shapefile will eventually be used to calculate the landscaped area once the NAIP imagery is classified.
For the land cover classification, I will use the RandomForest (Brieman 2001) ensemble decision tree algorithm by Leo Breiman and Adele Cutler, outlined in the random-forest-classification tutorial. RandomForest utilizes the scikit-learn Python library.
This is a supervised classification approach which inputs training polygons. The training polygons were manually created in ArcGIS Pro using the NAIP test tile as guidance, and assigned a classification of either 1-Turf or 2-Shrubs/Trees.
# Open training polygons
training_path = os.path.join('training',
'training_polys1.shp')
training_polys = gpd.read_file(training_path)
training_polys_reproj = training_polys.to_crs(tile_polygon.crs)
# Plot training polygons on NAIP test tile
colors = {'Turf': 'lightgreen',
'Trees/Shrubs': 'gold'}
fig, ax = plt.subplots(figsize=(20, 20))
ep.plot_rgb(clean_array_plot(test_tile),
rgb=[0, 1, 2],
ax=ax,
extent=tile_extent)
for cover_type, df in training_polys_reproj.groupby('DESCRIPTIO'):
color = colors[cover_type]
df.plot(color=color,
ax=ax,
label=cover_type)
ax.set_title('Training polygons over NAIP test tile\nGreen = Turf, Orange = Shrubs/Trees',
fontsize=18)
Text(0.5, 1.0, 'Training polygons over NAIP test tile\nGreen = Turf, Orange = Shrubs/Trees')
# Zoom in to see details of training polygons
colors = {'Turf': 'lightgreen',
'Trees/Shrubs': 'gold'}
fig, ax = plt.subplots(figsize=(10, 10))
ep.plot_rgb(clean_array_plot(test_tile),
rgb=[0, 1, 2],
ax=ax,
extent=tile_extent)
for cover_type, df in training_polys_reproj.groupby('DESCRIPTIO'):
color = colors[cover_type]
df.plot(color=color,
ax=ax,
label=cover_type)
ax.set_xlim([396600, 397400])
ax.set_ylim([3738300, 3739100])
ax.set_title('Training polygons over NAIP test tile\nGreen = Turf, Orange = Shrubs/Trees',
fontsize=18)
Text(0.5, 1.0, 'Training polygons over NAIP test tile\nGreen = Turf, Orange = Shrubs/Trees')
The training polygons can now be incorporated into the RandomForest workflow. Next steps include: